Introduction
This project uses data from the U.S. Bureau of Labor Statistics (BLS) to analyze employment trends across several U.S. states. The data include detailed occupational employment and wage estimates, which are essential for understanding labor market dynamics.
Data Source
The data for this analysis come from the U.S. Bureau of Labor Statistics public API. Details on accessing the data and on the structure of the API are available in the BLS API documentation.
These estimates are calculated with data collected from employers in all industry sectors, all metropolitan and nonmetropolitan areas, and all states and the District of Columbia. The top employment and wage figures are provided above. The complete list is available in the downloadable XLS files.
The percentile wage estimate is the value of a wage below which a certain percent of workers fall. The median wage is the 50th percentile wage estimate: 50 percent of workers earn less than the median and 50 percent of workers earn more than the median.
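To make the definition concrete, here is a minimal sketch in R using a small set of hypothetical hourly wages (the values are invented for illustration and are not BLS data). It uses base R's `median()` and `quantile()`; the BLS derives its published percentiles from survey data with its own methodology, so this only illustrates the concept.

```r
# Hypothetical hourly wages for ten workers (illustrative values only)
wages <- c(18.50, 22.00, 25.75, 31.20, 36.00, 41.10, 48.30, 55.00, 67.50, 90.00)

# Median: half of the workers earn less, half earn more
median_wage <- median(wages)

# 10th, 25th, 50th, 75th and 90th percentile wages
pct <- quantile(wages, probs = c(0.10, 0.25, 0.50, 0.75, 0.90))

# Share of workers earning less than the median
share_below <- mean(wages < median_wage)

print(pct)
print(share_below)
```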
Estimates for detailed occupations do not sum to the totals because the totals include occupations not shown separately. Estimates do not include self-employed workers.
Annual wages have been calculated by multiplying the hourly mean wage by a “year-round, full-time” hours figure of 2,080 hours; for those occupations where there is not an hourly wage published, the annual wage has been directly calculated from the reported survey data.
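As a quick check of this rule, the sketch below multiplies three of the hourly mean wages from the results table later in this document by 2,080. The products land within a few dollars of the published annual figures (140490, 109000, 101780). The small gaps would be consistent with rounding in the published numbers, though that explanation is an assumption rather than something the source states.

```r
# "Year-round, full-time" hours: 40 hours per week x 52 weeks
hours_per_year <- 2080

# Hourly mean wages taken from the results table in this document
hourly_mean <- c(California = 67.54, Texas = 52.40, Utah = 48.93)

# Approximate annual mean wage implied by the hourly figure
approx_annual <- hourly_mean * hours_per_year

print(approx_annual)
```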
The relative standard error (RSE) is a measure of the reliability of a survey statistic. The smaller the relative standard error, the more precise the estimate.
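The usual definition of the RSE is the standard error divided by the estimate, expressed as a percentage. The sketch below assumes that definition; the function name `relative_standard_error` and the standard error value are hypothetical, chosen only to illustrate the arithmetic.

```r
# RSE as a percentage: 100 * standard error / estimate
# (hypothetical helper, not part of any BLS tooling)
relative_standard_error <- function(estimate, standard_error) {
  100 * standard_error / estimate
}

# Illustrative values: an employment estimate with an assumed standard error
rse <- relative_standard_error(estimate = 20000, standard_error = 500)

print(rse)
```

A lower value means the sampling error is small relative to the size of the estimate.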
Estimate not released.
The location quotient is the ratio of the area concentration of occupational employment to the national average concentration. A location quotient greater than one indicates the occupation has a higher share of employment than average, and a location quotient less than one indicates the occupation is less prevalent in the area than average.
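The ratio described above can be sketched as follows. The function name `location_quotient` and all employment counts are hypothetical; only the formula (area share of employment divided by national share) comes from the definition.

```r
# Location quotient: area concentration divided by national concentration
# (hypothetical helper with invented employment counts)
location_quotient <- function(area_occ, area_total, nat_occ, nat_total) {
  area_share <- area_occ / area_total  # occupation's share of area employment
  nat_share  <- nat_occ / nat_total    # occupation's share of national employment
  area_share / nat_share
}

# The occupation is 0.2% of area jobs but 0.1% of national jobs
lq <- location_quotient(
  area_occ = 3000,  area_total = 1500000,
  nat_occ = 150000, nat_total = 150000000
)

print(lq)
```

In this illustration the occupation is twice as concentrated in the area as it is nationally.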
Example API Pull
Data Acquisition and Preparation
Function to Fetch Data
This function retrieves 2023 data for a list of BLS series IDs and labels the result with a state name:
# Packages used by the function below
library(httr)      # POST(), content(), content_type()
library(jsonlite)  # fromJSON()
library(dplyr)     # %>%, mutate(), select()
library(purrr)     # map_dfr()
library(tidyr)     # pivot_wider()
library(tibble)    # tibble()

# Function to get data based on seriesid and state name
get_state_data <- function(seriesid_list, state_name, api_key, url) {
  # Prepare the payload
  payload <- list(
    seriesid = seriesid_list,
    startyear = "2023",
    endyear = "2023",
    registrationkey = api_key
  )

  # Make the POST request
  response <- POST(url = url,
                   body = payload,
                   content_type("application/json"),
                   encode = "json")

  # Parse the response
  list_obj <- content(response, "text") %>%
    jsonlite::fromJSON()

  # Convert response to dataframe (fields are read by position)
  df <- map_dfr(list_obj$Results$series$data, ~ tibble(
    year = .x[[1]],
    period = .x[[2]],
    period_type = .x[[3]],
    is_annual = .x[[4]],
    value = .x[[5]]
  ))

  # Define column names
  column_names <- c("Employment (1)",
                    "Employment per thousand jobs",
                    "Location quotient (9)",
                    "Hourly mean wage",
                    "Annual mean wage (2)")

  # Mutate dataframe to include series titles
  # (assumes exactly five series, returned in the order listed above)
  df <- df %>%
    mutate(series_title = column_names)

  # Pivot dataframe and add state information
  state_df <- df %>%
    pivot_wider(
      names_from = series_title,
      values_from = value
    ) %>%
    mutate(State = state_name) %>%
    select(State, `Employment (1)`, `Employment per thousand jobs`,
           `Location quotient (9)`, `Hourly mean wage`, `Annual mean wage (2)`)

  return(state_df)
}
Texas
Utah
New York
Florida
Pennsylvania
California
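One way the per-state results could be gathered is to loop over a named list of series IDs and stack the returned rows. The sketch below is an assumption about how that step might look, not the code used for this report: `fetch_all_states`, `state_series`, `demo_fetch`, and the series IDs are all hypothetical placeholders. The fetch function is passed in as an argument so the loop can be tried without network access.

```r
# Hypothetical series IDs: placeholders, NOT real BLS identifiers
state_series <- list(
  Texas = c("SERIES_TX_1", "SERIES_TX_2"),
  Utah  = c("SERIES_UT_1", "SERIES_UT_2")
)

# Call fetch_fun once per state and stack the one-row results
fetch_all_states <- function(state_series, fetch_fun, ...) {
  per_state <- lapply(names(state_series), function(state) {
    fetch_fun(state_series[[state]], state, ...)
  })
  do.call(rbind, per_state)
}

# Stand-in for the real fetch function so the loop runs offline
demo_fetch <- function(seriesid_list, state_name, ...) {
  data.frame(State = state_name, n_series = length(seriesid_list))
}

demo <- fetch_all_states(state_series, demo_fetch)
print(demo)

# With a real key and real series IDs, the call might look like this (not run):
# fetch_all_states(state_series, get_state_data,
#                  api_key = Sys.getenv("BLS_KEY"),  # hypothetical variable name
#                  url = "https://api.bls.gov/publicAPI/v2/timeseries/data/")
```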
Combining Tables
| State | Employment (1) | Employment per thousand jobs | Location quotient (9) | Hourly mean wage | Annual mean wage (2) |
|---|---|---|---|---|---|
| California | 33220 | 1.851 | 1.46 | 67.54 | 140490 |
| Texas | 20560 | 1.517 | 1.20 | 52.40 | 109000 |
| New_York | 16390 | 1.745 | 1.38 | 64.82 | 134830 |
| Florida | 8400 | 0.878 | 0.69 | 51.20 | 106490 |
| Pennsylvania | 7490 | 1.259 | 0.99 | 49.22 | 102370 |
| Utah | 3530 | 2.105 | 1.66 | 48.93 | 101780 |
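The plots in the next section use a long-format data frame named `df_long` with `State`, `Metric`, and `Value` columns, but its construction is not shown. A minimal sketch of one way to build it with `tidyr::pivot_longer()` follows. It assumes a combined wide table like the one above (only three rows and two metrics are reproduced here) and assumes the values arrive as text and need converting to numbers; the name `combined` is hypothetical.

```r
library(tidyr)

# A few rows and columns of the combined table, stored as text
# to mimic values that have not yet been converted to numbers
combined <- data.frame(
  State = c("California", "Texas", "Utah"),
  `Hourly mean wage` = c("67.54", "52.40", "48.93"),
  `Annual mean wage (2)` = c("140490", "109000", "101780"),
  check.names = FALSE
)

# Convert every metric column (all but State) to numeric
combined[-1] <- lapply(combined[-1], as.numeric)

# One row per State/Metric pair, as expected by the plotting code
df_long <- pivot_longer(
  combined,
  cols = -State,
  names_to = "Metric",
  values_to = "Value"
)

print(df_long)
```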
Analysis and Visualization
Data visualization helps to highlight key insights and trends from the employment data:
Employment Data by State
library(ggplot2)

# Static chart: one panel per metric, one bar per state
ggplot(df_long, aes(x = State, y = Value, fill = State)) +
  geom_bar(stat = "identity", position = position_dodge(width = 0.9)) +
  labs(title = "Employment Data by State",
       x = NULL,
       y = "Value") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_fill_brewer(palette = "Set1") +
  facet_wrap(~ Metric, scales = "free_y", ncol = 1)

library(plotly)

# Interactive version: same chart with hover tooltips
p <- ggplot(df_long, aes(x = State, y = Value, fill = State,
                         text = paste("State:", State, "<br>Value:", Value, "<br>Metric:", Metric))) +
  geom_bar(stat = "identity", position = position_dodge(width = 0.9)) +
  labs(title = "Employment Data by State", x = NULL, y = "Value") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_fill_brewer(palette = "Set1") +
  facet_wrap(~ Metric, scales = "free_y", ncol = 1)
ggplotly(p, tooltip = "text")
Conclusion
This analysis compared employment and wages across six U.S. states, highlighting variation in employment levels, employment concentration (employment per thousand jobs and location quotient), and mean wages. Understanding these differences can help policymakers, economists, and the general public make informed decisions about employment and economic strategies.